INTRODUCTION


It  has  been  well  recognized  that  merging  information  from  different  imaging  modalities, 
such  as  mammography,  sonography  and  magnetic  resonance  imaging  (MRI),  will  greatly 
benefit  the  diagnosis  of  breast  cancer  [1-4],  as  well  as  contribute  to  the  assessment  of 
tumor  response  and  image-guided  therapy.  However,  interpreting  images  from  different 
modalities  is  not  trivial  as  different  images  of  the  same  lesion  may  exhibit  different 
physical  lesion  characteristics,  and  the  image  acquisitions  are  performed  under  different 
breast  positioning  protocols.  Also,  the  breast  is  a  non-rigid  object,  and  thus  conventional 
image  registration  methods  are  not  appropriate.  So  the  primary  problem  of  merging 
image  information  from  different  modalities  is  to  address  the  task  of  identifying 
corresponding  images  of  lesions  as  seen  with  different  imaging  techniques.  The  purpose 
of  this  research  is  to  develop  correlative  feature  analysis  methods  for  integrating  image 
information  from  multi-modality  breast  images,  taking  advantage  of  the  information  from 
different  views  and/or  different  modalities,  and  thus  improving  the  sensitivity  and 
specificity of breast cancer diagnosis. A novel aspect of the proposed research is the
incorporation of correlative feature analysis (CFA) into the decision-making process. Our
hypothesis is that the proposed correlative feature analysis can benefit the computerized
analysis of corresponding images, and thus help the radiologist efficiently distinguish between
corresponding and non-corresponding lesion pairs. This report summarizes the progress
made by the recipient of this Predoctoral Traineeship Award during the first year of the project.

BODY


Training  Accomplishments 

At the time of this report, the recipient of the Predoctoral Traineeship Award, Yading Yuan,
has taken 21 of the 22 courses required for the Ph.D. degree in medical physics. The one
remaining course will be taken in Fall 2007. The courses include physics of medical imaging,
physics of radiation therapy, mathematics for medical physicists, image processing, statistics,
machine learning, numerical computation, computer vision, anatomy of the body, radiation
biology, and teaching assistant training.

Research  Accomplishments 

1.  Database  collection 

The first part of our work has been collecting a multi-modality image database from the
University of Chicago Hospitals, which includes full-field digital mammographic
(FFDM) images, breast ultrasound (US) images and breast magnetic resonance (MR)
images. The FFDM database consists of 148 malignant and 139 benign lesions. All the
images were obtained from GE Senographe 2000D systems with a spatial resolution of
95 µm × 95 µm. The US database consists of 195 malignant solid lesions, 77 simple cysts,
25 fibrocystic nodules and 109 benign solid lesions. The US images were obtained with a
Philips HDI 5000 US unit and a 12-5 MHz linear array probe. The pixel size varied from
53 µm to 212 µm, with an average value of 114 µm. The MR database consists of 97
malignant and 84 benign lesions. The MR images were obtained from 1.5 T GE scanners
using T1-weighted 3D spoiled gradient echo sequences. For each case, one pre-contrast
and five post-contrast series were taken, and each series contained 60 coronal slices with
an in-plane spatial resolution ranging from 1.25 × 1.25 mm² to 1.6 × 1.6 mm². Slice thickness
ranged from 3 to 4 mm depending on breast size. All the cases in the multi-modality
database were identified by expert breast radiologists based on visual criteria and either
biopsy-proven or aspiration-proven reports.

Based on the FFDM database, we constructed 123 corresponding image pairs and 82
non-corresponding pairs. Each pair consists of a craniocaudal (CC) view and a mediolateral
(ML) view. To reflect the most realistic scenario of lesion mismatch in clinical
practice, the non-corresponding pairs were constructed from cases of the same patients
but different physical lesions. Since the number of patients in our database having two or
more lesions in the same breast is limited, the non-corresponding dataset included all
possible lesion combinations from different views.
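
The pairing rule described above can be illustrated with a minimal sketch. The function name, the dictionary layout and the lesion labels are hypothetical and are not taken from the project software; the sketch only assumes that each lesion of a patient has been labeled in both views.

```python
from itertools import product


def build_pairs(lesions_by_view):
    """Split all CC/ML combinations for one patient into corresponding
    and non-corresponding pairs.

    lesions_by_view is a hypothetical layout:
    {"CC": [lesion labels], "ML": [lesion labels]}.
    """
    corresponding, non_corresponding = [], []
    for cc, ml in product(lesions_by_view["CC"], lesions_by_view["ML"]):
        # Same label in both views means the same physical lesion.
        if cc == ml:
            corresponding.append((cc, ml))
        else:
            non_corresponding.append((cc, ml))
    return corresponding, non_corresponding
```

For a patient with two lesions visible in both views, this yields two corresponding pairs and two non-corresponding pairs.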

From the whole multi-modality database, we also constructed a dataset of 112 cases
having both mammography and sonography. When MR images are included, 88 cases
so far have images from all three modalities. Radiologists are currently determining
the correspondence of lesions appearing in the different modality images.

2. Investigation of lesion segmentation

Mass lesion segmentation on mammograms is a challenging task since mass lesions are
usually embedded and hidden in varying densities of parenchymal tissue structures. We
have developed a dual-stage method for automatic delineation of lesion boundaries on
FFDM images. This method utilizes a geometric active contour model that minimizes an
energy function based on the homogeneities inside and outside of the evolving contour.
Prior to the application of the active contour model, a radial gradient index (RGI) based
segmentation method is applied to yield an initial contour closer to the lesion boundary
location in a computationally efficient manner. Based on the initial segmentation, an
automatic background estimation method is applied to identify the effective circumstance
of the lesion, and a dynamic stopping criterion is implemented to terminate the contour
evolution when it reaches the lesion boundary. Using the FFDM database described
above, we quantitatively compared the proposed algorithm with a conventional
region-growing method and an RGI-based algorithm by use of the area overlap ratio between
computer segmentation and manual segmentation by an expert radiologist. At an overlap
threshold of 0.4, 85% of the images are correctly segmented with the proposed method,
while only 69% and 73% of the images are correctly delineated by our previously
developed region-growing and RGI methods, respectively. A full description of the method
is given in reference [5], which is attached as Appendix A.
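
As an illustration of the evaluation metric, the sketch below computes an area overlap ratio and the fraction of images that reach a given threshold. It assumes the overlap ratio is the intersection of the two segmented areas divided by their union; the exact definition used in the study is given in reference [5], and the function names here are hypothetical.

```python
def area_overlap_ratio(computer_mask, manual_mask):
    """Intersection over union of two binary masks (equal-size 2-D lists).

    Assumption: overlap = |A and B| / |A or B|.
    """
    intersection = 0
    union = 0
    for row_c, row_m in zip(computer_mask, manual_mask):
        for c, m in zip(row_c, row_m):
            if c and m:
                intersection += 1
            if c or m:
                union += 1
    return intersection / union if union else 0.0


def fraction_correct(overlaps, threshold=0.4):
    """Fraction of images whose overlap ratio reaches the threshold."""
    return sum(o >= threshold for o in overlaps) / len(overlaps)
```

With this convention, a segmentation counts as correct at the 0.4 threshold when at least 40% of the combined area is shared by the computer and manual outlines.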

3.  Investigation  of  feature  correlation 

We evaluated the correlation performance of individual computerized features extracted
from the FFDM images of a lesion obtained in CC and ML views. In order to evaluate the
robustness of the correlation performance to lesion segmentation, three automatic
segmentation methods were employed, in addition to the radiologist's outlines, to extract
the mass lesion from the surrounding tissues: a conventional region-growing method,
an RGI-based method and the newly developed dual-stage segmentation
method. Fifteen computer-extracted features of each lesion were calculated in both views in
order to quantify the margin, shape, contrast and texture characteristics of the lesion.
For each feature, the correlation coefficient between the two views and the p-value of the
derived correlation coefficient were obtained. Our results show that the features
characterizing shape, contrast and texture performed best among the 15 individual
features regardless of segmentation method and pathology. This is because features
representing large-scale information are less sensitive to changes of position than those
representing small-scale information, which results in a higher correlation between
views for large-scale features than for small-scale features. This work
provides a guide for discriminating corresponding and non-corresponding lesion pairs
within the CAD framework. It is also helpful for guiding the development of new
features to improve the accuracy of image matching in disease diagnosis and prognosis.
A more detailed summary can be found in reference [6], which is attached as
Appendix B.
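
The per-feature analysis can be sketched as follows, assuming the correlation coefficient is the Pearson (linear) coefficient between the CC and ML values of one feature. The function names are hypothetical. The p-value is not computed here; under the usual assumptions it would be obtained by comparing the t statistic below with a Student t distribution with n - 2 degrees of freedom.

```python
import math


def pearson_r(cc_values, ml_values):
    """Linear correlation coefficient of one feature across two views."""
    n = len(cc_values)
    mean_cc = sum(cc_values) / n
    mean_ml = sum(ml_values) / n
    sxy = sum((a - mean_cc) * (b - mean_ml) for a, b in zip(cc_values, ml_values))
    sxx = sum((a - mean_cc) ** 2 for a in cc_values)
    syy = sum((b - mean_ml) ** 2 for b in ml_values)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0 with n paired samples (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```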

Mutual information (MI) is another measure of the dependence between two variables. It
is well understood that mutual information measures general dependence, while the
correlation coefficient measures linear dependence. We therefore also investigated the
mutual information among the features and assessed its effect on the choice of
discriminating features as compared with the use of the linear correlation coefficient between
features. For each feature described above, the mutual information between the two views
was obtained using a density estimation method (e.g., Parzen windows). The
dependence ranking of features determined by mutual information agreed closely with that
determined by the linear correlation coefficient, yielding a correlation coefficient of 0.87.
This result indicates that the linear correlation coefficient is a good metric to represent the
dependence between features from different views. Moreover, since the linear correlation
coefficient is bounded to [-1, 1], we will use the linear correlation coefficient as the metric to
choose the discriminating features.
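
To make the quantity concrete, the sketch below estimates mutual information from binned feature values. This is a simplification and an assumption: it replaces the Parzen-window density estimate mentioned above with an equal-width histogram, and the function names are hypothetical.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map continuous values to integer bin labels 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against constant features
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x_bins, y_bins):
    """Histogram estimate of MI (in bits) between two label sequences."""
    n = len(x_bins)
    joint = Counter(zip(x_bins, y_bins))
    px = Counter(x_bins)
    py = Counter(y_bins)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (px[a] * py[b]))
    return mi
```

Unlike the correlation coefficient, this estimate is non-negative and has no fixed upper bound, which is one reason a bounded metric is more convenient for ranking features.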

4.  Development  of  new  computerized  features 

Since features characterizing large-scale information usually have better correlation
performance, we developed two sets of “large-scale” features. First, we extracted a set
of texture features based on the gray-level co-occurrence matrix (GLCM). For each region,
four GLCMs were constructed along four different directions: 0°, 45°, 90° and 135°.
Assuming that there are no directional texture features in mammograms, a non-directional
GLCM was obtained by summing all the directional GLCMs. Texture features were then
computed from each non-directional GLCM. To avoid sparse GLCMs for smaller lesions,
the gray-level range of the image was scaled down to 6 bits, resulting in GLCMs of size
64 × 64. Among the texture features, the correlation feature performed best, with a correlation
coefficient of 0.67 (p-value < 10^-3).
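
The GLCM construction can be sketched as follows. Several details are assumptions rather than the project's implementation: a pixel offset of one, symmetric counting of each pixel pair, a hypothetical input bit depth, and the standard definition of the GLCM correlation feature.

```python
import math

# Unit offsets (row, column) for the 0, 45, 90 and 135 degree directions.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def to_six_bits(image, bit_depth):
    """Scale integer gray levels down to 6 bits (64 levels)."""
    return [[v >> (bit_depth - 6) for v in row] for row in image]


def nondirectional_glcm(image, levels):
    """Sum of the four directional GLCMs, counted symmetrically."""
    glcm = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for dr, dc in OFFSETS:
        for r in range(rows):
            for c in range(cols):
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < rows and 0 <= c2 < cols:
                    i, j = image[r][c], image[r2][c2]
                    glcm[i][j] += 1
                    glcm[j][i] += 1
    return glcm


def glcm_correlation(glcm):
    """GLCM correlation feature (requires at least two gray levels present)."""
    levels = len(glcm)
    total = sum(map(sum, glcm))
    cells = [(i, j, glcm[i][j] / total) for i in range(levels) for j in range(levels)]
    mu_i = sum(i * p for i, _, p in cells)
    mu_j = sum(j * p for _, j, p in cells)
    var_i = sum((i - mu_i) ** 2 * p for i, _, p in cells)
    var_j = sum((j - mu_j) ** 2 * p for _, j, p in cells)
    cov = sum((i - mu_i) * (j - mu_j) * p for i, j, p in cells)
    return cov / math.sqrt(var_i * var_j)
```

For a 12-bit image, `to_six_bits(image, 12)` followed by `nondirectional_glcm(quantized, 64)` gives a 64 × 64 matrix of the size described above.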

In clinical practice, radiologists commonly use the distance from the nipple to the center of a
lesion to correlate the lesion in different views. It is generally believed that this distance
remains fairly constant. Thus, we developed a distance feature that measures the Euclidean
distance between the nipple location and the center of mass of the lesion. We also developed an
automatic nipple localization scheme to track the nipple markers on each FFDM image.
With  computer-identified  nipples,  the  distance  features  in  CC  views  are  highly  correlated 
with  those  in  ML  views,  yielding  a  correlation  coefficient  of  0.88  (p-value  <  10'  ). 
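
A minimal sketch of the distance feature is given below. It assumes the lesion center is the unweighted centroid of the segmented pixels and uses the 95 µm FFDM pixel size to convert to millimeters; the function name and argument layout are hypothetical.

```python
import math


def nipple_distance_mm(nipple_xy, lesion_pixels, pixel_size_mm=0.095):
    """Euclidean distance from the nipple to the lesion centroid, in mm.

    nipple_xy: (x, y) pixel coordinates of the nipple.
    lesion_pixels: list of (x, y) coordinates inside the lesion contour.
    """
    n = len(lesion_pixels)
    cx = sum(x for x, _ in lesion_pixels) / n
    cy = sum(y for _, y in lesion_pixels) / n
    return math.hypot(cx - nipple_xy[0], cy - nipple_xy[1]) * pixel_size_mm
```

The same function would be applied to the CC and ML images of a lesion, and the two resulting distances compared.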

5.  Evaluation  of  the  performance  of  computerized  features  for  the  task  of 
distinguishing  corresponding  image  pairs  and  non-corresponding  ones 

We used the FFDM database to evaluate the performance of computerized features for
the task of distinguishing corresponding and non-corresponding image pairs from CC and
ML views [7]. Seventeen features were automatically extracted from the lesions and
grouped into three categories: (I) density and morphological features; (II) texture features;
and (III) the distance feature. A stepwise feature selection procedure was employed to select
an effective subset of features, which were then combined by a Bayesian artificial neural
network (BANN) to obtain a discriminant score that yielded an estimate of the probability
that the two images are of the same physical lesion. Receiver operating characteristic (ROC)
analysis was used to evaluate the classification performance of the individual features and
the selected feature subset. The distance feature yielded an AUC (area under the ROC
curve) of 0.81 with leave-one-out cross-validation, and the feature subset with 3 features
yielded an AUC of 0.86. The preliminary study, which includes 124 corresponding and
35 non-corresponding image pairs, has been submitted to the SPIE Medical Imaging
Conference, 2008. The abstract is attached as Appendix C.
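
For reference, the area under the ROC curve can be estimated directly from discriminant scores. The sketch below uses the nonparametric (Mann-Whitney) estimate, which is an assumption: the report does not state which ROC fitting method was used, and the function name is hypothetical.

```python
def roc_auc(scores_corresponding, scores_non_corresponding):
    """Probability that a corresponding pair scores higher than a
    non-corresponding pair, counting ties as one half."""
    wins = 0.0
    for p in scores_corresponding:
        for q in scores_non_corresponding:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_corresponding) * len(scores_non_corresponding))
```

In a leave-one-out setting, each score would come from a classifier trained on all the other image pairs.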

KEY RESEARCH ACCOMPLISHMENTS


•  Collected  and  maintained  a  multi-modality  database  including  full-field  digital 
mammograms,  breast  ultrasound  images  and  breast  MR  images.  More  than  180 
lesions  were  collected  for  each  modality,  which  is  suitable  for  the  further  correlative 
feature  analysis  across  image  modalities. 

• Developed a dual-stage lesion segmentation method for FFDM images, which
outperformed our previously developed region-growing method
and the RGI-based segmentation method.

•  Investigated  feature  correlation  with  both  linear  correlation  coefficient  and  mutual 
information.  The  results  demonstrate  that  the  features  representing  large-scale 
information  of  lesions  usually  have  better  correlation  performance  and  linear 
correlation  coefficient  is  an  appropriate  metric  characterizing  the  dependence 
between  features  from  different  views. 

• Developed texture features and a distance feature, which proved useful
in differentiating corresponding and non-corresponding image pairs.

•  Evaluated  the  performance  of  computerized  features  for  the  task  of  distinguishing 
corresponding  and  non-corresponding  image  pairs.  The  selected  feature  subset 
yielded  an  AUC  of  0.86  with  leave-one-out  cross-validation. 




REPORTABLE  OUTCOMES 


Peer-reviewed  Journal  Papers 

• Y. Yuan, M. L. Giger, H. Li, K. Suzuki and C. Sennett, “A dual-stage method for
lesion segmentation on digital mammograms,” Med. Phys. (in press), 2007.

Conference Proceeding Papers

• M. L. Giger, Y. Yuan, H. Li, K. Drukker, W. Chen, L. Lan and K. Horsch, “Progress
in breast CADx,” 4th IEEE International Symposium on Biomedical Imaging: From
Nano to Macro (ISBI 2007), Arlington, Virginia, 2007.

• H. Li, M. L. Giger, Y. Yuan, L. Lan, K. Suzuki, A. Jamieson and C. Sennett,
“Comparison of computerized image analyses for digitized mammograms and FFDM
images,” International Workshops on Digital Mammography, Manchester, United
Kingdom, 2006.

Conference Presentations and Abstracts

• Y. Yuan, M. L. Giger, H. Li and C. Sennett, “Correlative feature analysis of FFDM
images,” submitted to SPIE Medical Imaging Conference, 2008.

• Y. Yuan, M. L. Giger, H. Li and C. Sennett, “Computer-based feature correlation on
multiple-view FFDM images,” Radiological Society of North America, Chicago,
Illinois, 2006.

CONCLUSIONS


The recipient of the Predoctoral Traineeship Award has taken all the required core
courses as well as many research-related elective courses. This training has proven
useful in helping the recipient achieve the proposed research goals.

During  the  first  year,  we  have  collected  and  maintained  a  multi-modality  database 
including  full-field  digital  mammograms,  breast  ultrasound  images  and  breast  MR 
images,  which  is  suitable  for  the  proposed  research  on  correlative  feature  analysis  for 
multi-modality images. We have developed computerized methods for lesion
segmentation, feature extraction and selection, feature correlation analysis, and image pair
classification for differentiating corresponding and non-corresponding FFDM image pairs
from CC and ML views. The results show that our computerized
correlative feature analysis has great potential for identifying the corresponding image pair
of a lesion in FFDM images.

Overall, we have achieved the goals for the first year and laid a good foundation
for the research in the next two years. Our goals for the next two years include collecting
more image data, developing a feature selection method based on mutual
information and comparing it with stepwise and genetic algorithm-based
feature selection methods, investigating features that correlate better
between image pairs across different imaging modalities, and evaluating the proposed
correlative feature analysis with the whole multi-modality database.

Abstract


Mass lesion segmentation on mammograms is a challenging task since mass lesions
are usually embedded and hidden in varying densities of parenchymal tissue structures.
In this paper, we present a method for automatic delineation of lesion boundaries on
digital mammograms. This method utilizes a geometric active contour model that
minimizes an energy function based on the homogeneities inside and outside of the
evolving contour. Prior to the application of the active contour model, a radial gradient
index (RGI) based segmentation method is applied to yield an initial contour closer
to the lesion boundary location in a computationally efficient manner. Based on the
initial segmentation, an automatic background estimation method is applied to identify
the effective circumstance of the lesion, and a dynamic stopping criterion is implemented
to terminate the contour evolution when it reaches the lesion boundary. Using a
full-field digital mammography database with 739 images, we quantitatively compare
the proposed algorithm with a conventional region-growing method and an RGI-based
algorithm by use of the area overlap ratio between computer segmentation and manual
segmentation by an expert radiologist. At an overlap threshold of 0.4, 85% of the
images are correctly segmented with the proposed method, while only 69% and 73%
of the images are correctly delineated by our previously developed region-growing and
RGI methods, respectively. The resulting improvement in segmentation is statistically
significant.

Key  words:  Mass  lesion  segmentation,  geometric  active  contour  model,  computer- 
aided  diagnosis,  breast  cancer 

I. INTRODUCTION


Breast cancer is the most common malignancy in American women and the second most
common cause of death from malignancy in this population. According to the American
Cancer Society, about 178,480 women in the United States will be found to have invasive
breast cancer in 2007, and about 40,460 women will die from the disease this year [1].
Although some imaging modalities, such as magnetic resonance imaging (MRI) [2] [3] and
sonography [4] [5], are currently being investigated to improve sensitivity and specificity of
breast cancer diagnosis, X-ray mammography is still the most prevalent imaging procedure
for the early detection of breast cancer.

Lesion segmentation, which extracts the lesion from the surrounding tissues, is an essential
step in the computerized analysis of mammograms. As mass lesions are usually embedded
and hidden in varying densities of parenchymal structures, the task of lesion segmentation
is not trivial. Many researchers have developed computer algorithms for this task. Huo et
al. [6] employed a region-growing method to find the contour, in which abrupt changes in
size and circularity were used as the rules of segmentation. Kupinski et al. [7] segmented
the mass by applying either a radial gradient index (RGI) model or a probabilistic model to
the lesion, multiplied by a constraint function. Petrick et al. [8] introduced a segmentation
algorithm that combines a density-weighted contrast enhancement filter and a region-growing
method. Li et al. [9] employed a multiresolution Markov random field model to detect
tumors in mammographic images. Timp et al. [10] employed both edge-based information
and a priori knowledge about the grey level distribution of the region of interest (ROI)
around the mass, and obtained an optimal contour using dynamic programming. To segment
lesions, Guliato et al. [11] proposed two methods related to fuzzy sets: one employing region
growing after fuzzy-sets-based pre-processing, and the other using a fuzzy region-growing
method that takes into account the uncertainty present around the boundaries of tumors.
Li et al. [12] presented a statistical model for enhanced segmentation and extraction of a
suspicious mass area from mammographic images. In their study, a morphological operation
is derived to enhance disease patterns of suspected masses by eliminating unrelated background
clutter, and a model-based image segmentation is performed to localize the suspected
mass areas using stochastic relaxation labeling.

Originally introduced by Kass et al. [13], active contour models (or snakes) have attracted much
attention as image segmentation techniques. An active contour model minimizes an energy
functional along a deformable contour, which is influenced by both internal and external
terms. The internal energy controls the smoothness and elasticity of the contour, while
the external energy attracts the evolving contour to deform toward salient image features,
such as edges. Although the active contour model has been used for segmenting objects in
a wide range of medical applications [14] [15] [16] [17] [18] [19], to the best of our knowledge,
few works have applied this model to the task of lesion segmentation in mammographic
images. Brake et al. [20] segmented mass lesions by a discrete active contour method
whose external energy was determined by the image gradient magnitude. Sahiner et al. [21]
applied an active contour model that incorporated edge and region analysis, in which the
contour energy was minimized by a greedy algorithm. In their work, however, the contour
was represented by the vertices of an N-point polygon and each vertex was tracked during
the process, which makes it difficult for the contour to adapt to a change of topology, such
as splitting or merging parts.

Differing from the segmentation methods mentioned above, in this study, we develop an
automatic lesion segmentation algorithm that employs a geometric active contour model to
extract lesions. Geometric active contour models [22] [23] represent contours as a level set
of a higher-dimensional scalar function [24]. The contours are obtained only after complete
evolution, thereby allowing the model to handle topological changes naturally. As mass
lesions usually have weak edges, we use a region-based active contour model [25] that is based
on global image information, and is less sensitive to noise and the initial contour. In order to
improve the computational efficiency and suppress the influence of unrelated structures, our
previous RGI-based segmentation method [7] is applied first to delineate an initial contour,
which is relatively close to the actual margin, and to estimate the effective background. We
then exploit a dynamic stopping criterion, which is solely based on the properties of the given
image, to terminate the evolving procedure automatically.

The organization of this paper is as follows: Section 2 introduces the database used for
this study. Section 3 describes the proposed segmentation method. Section 4 presents the
results, and Sections 5 and 6 give a discussion and conclusion, respectively.

II.  MATERIALS 

In this study, we used a full-field digital mammography (FFDM) database, which consists
of 139 benign (327 mammograms) and 148 malignant (412 mammograms) lesions. All the
images were collected from the University of Chicago Hospitals (UCH) and obtained from
GE Senographe 2000D systems (GE Medical Systems, Milwaukee, WI) with a spatial
resolution of 95 µm × 95 µm. The masses were identified and outlined by an expert breast
radiologist based on visual criteria and biopsy-proven reports. These outlines were used as
the “gold standard” for calibrating parameters and evaluating performance. The distributions
of effective projection diameter, which is defined as the effective diameter of the area inside
the radiologist's manually delineated contours, are shown in Fig. 1.

[Figure  1  about  here.] 


III.  METHODS 


The main aspects of the proposed segmentation method include an initial RGI segmentation [7],
background estimation and trend correction, and an active contour segmentation based on
level sets. Fig. 2 shows the flow chart of the overall implementation.


[Figure 2 about here.]

A. Active contour model


The  active  contour  model  [25]  relies  on  an  intrinsic  property  of  image  segmentation:  for  an 
image  formed  by  two  regions,  each  segmented  region  should  be  as  homogeneous  as  possible. 
Mathematically,  this  model  can  be  expressed  by  the  following  energy  function: 


$$E(c_1, c_2, C) = \mu \cdot \mathrm{Length}(C) + \lambda_1 \int_{\mathrm{inside}(C)} |f_0(x, y) - c_1|^2 \, dx \, dy + \lambda_2 \int_{\mathrm{outside}(C)} |f_0(x, y) - c_2|^2 \, dx \, dy \tag{1}$$

where $\mu > 0$ and $\lambda_1, \lambda_2 > 0$ are fixed weight parameters, $C$ is the evolving contour,
$\mathrm{Length}(C)$ is a regularizing term that prevents the final contour from converging to a small
area due to noise, and $c_1$ and $c_2$ are the mean values inside and outside of $C$, respectively. Note
that many other active contour models are edge-based, as opposed to the gray-level-based
method used here.
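
The two region terms of Equation (1) can be evaluated for a candidate contour as in the sketch below. The length term is omitted, the contour is represented by a binary mask of its interior, and the function names and default weights are assumptions made for illustration.

```python
def region_means(image, inside_mask):
    """Mean gray level inside (c1) and outside (c2) the contour."""
    inside, outside = [], []
    for row, mask_row in zip(image, inside_mask):
        for value, is_inside in zip(row, mask_row):
            (inside if is_inside else outside).append(value)
    c1 = sum(inside) / len(inside) if inside else 0.0
    c2 = sum(outside) / len(outside) if outside else 0.0
    return c1, c2


def fitting_energy(image, inside_mask, lam1=1.0, lam2=1.0):
    """Sum of the two homogeneity terms of Equation (1)."""
    c1, c2 = region_means(image, inside_mask)
    energy = 0.0
    for row, mask_row in zip(image, inside_mask):
        for value, is_inside in zip(row, mask_row):
            if is_inside:
                energy += lam1 * (value - c1) ** 2
            else:
                energy += lam2 * (value - c2) ** 2
    return energy
```

The energy is smallest when the mask separates the image into two homogeneous regions, which is the property the model exploits.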

Equation (1) can be represented and solved by level set theory [26]. In level set theory,
the two-dimensional evolving contour $C$ is represented implicitly as the zero level set of
a three-dimensional Lipschitz function $\phi(x, y)$, i.e., $C = \{(x, y) \in \Omega : \phi(x, y) = 0\}$, and
the contour is evolved by updating the level set function $\phi(x, y)$ at fixed coordinates through
iterations instead of tracking the contour itself. The initial level set function $\phi(x, y; t = 0)$
is usually defined as the signed distance function:

$$\phi(x, y; t = 0) = \pm d \tag{2}$$

where $d$ is the distance from $(x, y)$ to $C(t = 0)$, and $C(t = 0)$ corresponds to the initial
contour. The plus (minus) sign is chosen if the point $(x, y)$ is inside (outside) the initial
contour $C(t = 0)$.
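
As a simple illustration of Equation (2), the sketch below builds the signed distance function for a circular initial contour, for which the distance has a closed form. The circle is an assumption chosen for brevity; in the proposed method the initial contour comes from the RGI-based segmentation. The function name is hypothetical.

```python
import math


def signed_distance_circle(rows, cols, center, radius):
    """Signed distance to a circle: positive inside, negative outside."""
    center_row, center_col = center
    return [
        [radius - math.hypot(col - center_col, row - center_row)
         for col in range(cols)]
        for row in range(rows)
    ]
```

The zero level set of the returned array is the circle itself, and the gradient magnitude equals one everywhere except at the center.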

With the evolution of the contour, the level set function $\phi$ cannot be held as a signed
distance function, nor can it be kept smooth. In order to maintain a smooth level set function,
and thus ensure numerical stability of the evolution, it is necessary to reinitialize the evolving
level set function to a signed distance function periodically. However, reinitialization is a
computationally expensive procedure as it involves solving the partial differential equation
$\phi_t = \mathrm{sign}(\phi)(1 - \|\nabla\phi\|)$, where $\nabla\phi$ corresponds to the gradient of the level set function.
In addition, most reinitialization schemes tend to move the contour to some degree due to
numerical errors [27].

A signed distance function $\phi$, however, has the intrinsic property that $\|\nabla\phi\| = 1$. It is
therefore more natural to incorporate this property into the contour evolution than to use
the independent reinitialization procedure described above. Thus, we introduce another
regularizing term [28] into the active contour model in (1):

$$E(c_1, c_2, C) = \mu \cdot \mathrm{Length}(C) + \nu \cdot \frac{1}{2} \int_{\Omega} (1 - \|\nabla\phi\|)^2 \, dx \, dy + \lambda_1 \int_{\mathrm{inside}(C)} |f_0(x, y) - c_1|^2 \, dx \, dy + \lambda_2 \int_{\mathrm{outside}(C)} |f_0(x, y) - c_2|^2 \, dx \, dy \tag{3}$$

where $\nu$ is a weight parameter and $\Omega$ represents the whole image space.

By replacing $C$ with $\phi(x, y)$ in the energy functional in (3) and introducing the regularized
version of the Heaviside function $H_\epsilon(\phi) = \frac{1}{2}\left[1 + \frac{2}{\pi}\arctan\left(\frac{\phi}{\epsilon}\right)\right]$ along with the corresponding
Dirac measure $\delta_\epsilon(\phi) = \frac{d}{d\phi} H_\epsilon(\phi) = \epsilon \cdot [\pi \cdot (\epsilon^2 + \phi^2)]^{-1}$, as given by Chan and Vese in [25],
Equation (3) can be expressed as:

$$E_\epsilon(c_1, c_2, \phi) = \mu \int_{\Omega} \delta_\epsilon(\phi(x, y)) \, \|\nabla\phi(x, y)\| \, dx \, dy + \nu \cdot \frac{1}{2} \int_{\Omega} (1 - \|\nabla\phi(x, y)\|)^2 \, dx \, dy + \lambda_1 \int_{\Omega} |f_0(x, y) - c_1|^2 \, H_\epsilon(\phi(x, y)) \, dx \, dy + \lambda_2 \int_{\Omega} |f_0(x, y) - c_2|^2 \, (1 - H_\epsilon(\phi(x, y))) \, dx \, dy \tag{4}$$



where  the  first  integral  controls  the  length  of  the  contour  and  the  second  integral  helps  to 
smooth  the  level  set  function  and  thus  avoid  the  need  for  reinitialization. 
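
The regularized Heaviside function and Dirac measure used above can be written directly from their definitions. The sketch below is only illustrative; the default value of the regularization parameter is an assumption, since the text does not state the value used.

```python
import math


def heaviside_eps(phi, eps=1.0):
    """Regularized Heaviside function H_eps(phi)."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(phi / eps))


def dirac_eps(phi, eps=1.0):
    """Regularized Dirac measure, the derivative of H_eps with respect to phi."""
    return eps / (math.pi * (eps * eps + phi * phi))
```

Because the Dirac measure is nonzero for every value of the level set function, the data terms act on the whole image rather than only on a narrow band around the contour.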

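By fixing $c_1$ and $c_2$ and minimizing $E_\epsilon$ with respect to $\phi$ at each iteration, the associated
Euler-Lagrange equation can be derived as:

$$\delta_\epsilon(\phi) \cdot [\mu \cdot \kappa - \lambda_1 \cdot (f_0 - c_1)^2 + \lambda_2 \cdot (f_0 - c_2)^2] + \nu \cdot \mathrm{div}\left[\left(1 - \frac{1}{\|\nabla\phi\|}\right) \cdot \nabla\phi\right] = 0 \tag{5}$$

where

$$\kappa = \mathrm{div}\left(\frac{\nabla\phi}{\|\nabla\phi\|}\right) \tag{6}$$

represents the curvature of the contour $C$. Equation (5) now also incorporates the regularizing
term from Li et al. [28]. This derivation, combining the aspects of active contours without
edges and level sets without reinitialization, is given in Appendix I. Using the gradient
descent method, we can solve for $\phi$ in Equation (5) iteratively by letting $\phi$ be a function of
the iteration time $t$ and replacing the zero on the right-hand side of (5) with the time derivative of $\phi$.
Thus, we obtain the partial differential equation:

$$\frac{\partial\phi}{\partial t} = \delta_\epsilon(\phi) \cdot [\mu \cdot \kappa - \lambda_1 \cdot (f_0 - c_1)^2 + \lambda_2 \cdot (f_0 - c_2)^2] + \nu \cdot \mathrm{div}\left[\left(1 - \frac{1}{\|\nabla\phi\|}\right) \cdot \nabla\phi\right]. \tag{7}$$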

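The time derivative $\partial\phi/\partial t$ was approximated by a forward finite difference:

$$
\frac{\partial\phi}{\partial t} \approx \frac{\phi^{n+1} - \phi^{n}}{\Delta t}
\tag{8}
$$

For the numerical stability of the PDE solution, the curvature $\kappa$ was approximated
by a discretization scheme that combines both forward and backward finite differences,
as suggested in [29]:

$$
\kappa = \Delta^x_-\left(\frac{\Delta^x_+\phi_{i,j}}{\left((\Delta^x_+\phi_{i,j})^2 + \left(m(\Delta^y_+\phi_{i,j}, \Delta^y_-\phi_{i,j})\right)^2\right)^{1/2}}\right)
 + \Delta^y_-\left(\frac{\Delta^y_+\phi_{i,j}}{\left((\Delta^y_+\phi_{i,j})^2 + \left(m(\Delta^x_+\phi_{i,j}, \Delta^x_-\phi_{i,j})\right)^2\right)^{1/2}}\right)
\tag{9}
$$

where

$$
\Delta^x_\mp\phi_{i,j} = \mp(\phi_{i\mp1,j} - \phi_{i,j})
\tag{10}
$$

and similarly for $\Delta^y_\mp\phi_{i,j}$, and

$$
m(a, b) = \left(\frac{\mathrm{sgn}(a) + \mathrm{sgn}(b)}{2}\right)\min(|a|, |b|).
\tag{11}
$$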
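
The discretization in Equations (8) to (11) can be sketched as follows. This is only an illustrative sketch under stated assumptions: unit grid spacing, replicated (zero-difference) boundaries, a small constant `tiny` to avoid division by zero, and central differences for the regularizing divergence term, none of which are specified in the text. The function names are hypothetical.

```python
import numpy as np

def d_plus(phi, axis):
    """Forward difference phi[i+1] - phi[i]; zero at the far boundary."""
    last = np.take(phi, [-1], axis=axis)
    return np.diff(phi, axis=axis, append=last)

def d_minus(phi, axis):
    """Backward difference phi[i] - phi[i-1]; zero at the near boundary."""
    first = np.take(phi, [0], axis=axis)
    return np.diff(phi, axis=axis, prepend=first)

def minmod(a, b):
    """m(a, b) of Eq. (11)."""
    return 0.5 * (np.sign(a) + np.sign(b)) * np.minimum(np.abs(a), np.abs(b))

def curvature(phi, tiny=1e-8):
    """Curvature of Eq. (9); axis 0 plays the role of x, axis 1 of y."""
    px, mx = d_plus(phi, 0), d_minus(phi, 0)
    py, my = d_plus(phi, 1), d_minus(phi, 1)
    tx = px / np.sqrt(px ** 2 + minmod(py, my) ** 2 + tiny)
    ty = py / np.sqrt(py ** 2 + minmod(px, mx) ** 2 + tiny)
    return d_minus(tx, 0) + d_minus(ty, 1)

def evolve_once(phi, f0, c1, c2, mu, nu, lam1=1.0, lam2=1.0, eps=1.0, dt=0.1):
    """One forward-Euler step of Eq. (7) using the difference of Eq. (8)."""
    delta = eps / (np.pi * (eps ** 2 + phi ** 2))
    g0, g1 = np.gradient(phi)
    norm = np.sqrt(g0 ** 2 + g1 ** 2) + 1e-8
    w = 1.0 - 1.0 / norm
    # div[(1 - 1/|grad phi|) grad phi], central differences (an assumption)
    reg = np.gradient(w * g0, axis=0) + np.gradient(w * g1, axis=1)
    data = -lam1 * (f0 - c1) ** 2 + lam2 * (f0 - c2) ** 2
    return phi + dt * (delta * (mu * curvature(phi) + data) + nu * reg)
```

In a full evolution loop, `c1` and `c2` would be recomputed from the current $\phi$ before each call to `evolve_once`, since the derivation fixes them only within an iteration.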

B.  Contour  initialization 

The energy function in Equation (3) depends on the evolving curve $C$ in a complex way. It
is not guaranteed to be quadratic or even convex, and the evolution may settle in a local minimum
of the energy function somewhere in the neighborhood of the initial contour. Thus, initializing the
contour is a non-trivial task for active contour models. Since lesion sizes vary, it is difficult
to find fixed parameters (such as the radius of a circle) with which to initialize the contour
for an entire database. Hence, we use our previous RGI-based segmentation method [7] to
estimate the initial boundary of a lesion.

The RGI-based segmentation algorithm [7] incorporates the prior knowledge that mass lesions
are roughly compact. The original image $f(x, y)$ is therefore multiplied by a two-dimensional
constraint function $G(x, y; \mu_x, \mu_y, \sigma^2)$ to yield a pre-processed image $h(x, y)$:

$$
h(x, y) = f(x, y) \times G(x, y; \mu_x, \mu_y, \sigma^2)
\tag{12}
$$

where $G(x, y; \mu_x, \mu_y, \sigma^2)$ is a Gaussian function centered at the manually-indicated seed
point $(\mu_x, \mu_y)$ with variance $\sigma^2$. The multiplication with the Gaussian function reduces
the contribution of structures beyond the lesion; $\sigma$ is set to 15 mm to accommodate
most mammographic lesion sizes. We have found that the segmentation performance is not
strongly dependent on the choice of $\sigma$. Larger lesions can also be segmented, although
small deviations around the margin of the lesion are usually not delineated well.

Starting from the given seed point $(\mu_x, \mu_y)$, a series of gray-level thresholds is then
applied to the pre-processed image $h(x, y)$ to yield multiple contours. For each contour, an
RGI value is calculated, where RGI is defined as:

$$
\mathrm{RGI}(\mu_x, \mu_y, C_i) =
\frac{\sum_{(x,y) \in C_i} \nabla h(x,y) \cdot \dfrac{r(x,y)}{\|r(x,y)\|}}
     {\sum_{(x,y) \in C_i} \|\nabla h(x,y)\|}
\tag{13}
$$

where $C_i$ is the set of points on the $i$th contour, $\nabla h(x,y)$ is the gradient vector of $h(x,y)$
at point $(x,y)$, and $r(x,y)/\|r(x,y)\|$ is the normalized radial vector, the direction of which is
calculated at position $(x,y)$ with respect to the seed point $(\mu_x, \mu_y)$. Of these contours, the
one yielding the maximum RGI value is chosen as the contour that best delineates the lesion
in the initial step.
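
A minimal sketch of Equations (12) and (13) is given below. It evaluates the RGI formula literally for a list of integer (row, column) contour points. The unnormalized Gaussian (peak value one), the use of `np.gradient` for $\nabla h$, and the function names are assumptions made for illustration; the sketch does not address how the candidate contours are extracted from the thresholded image or any image-polarity convention.

```python
import numpy as np

def gaussian_constraint(shape, seed, sigma):
    """Isotropic Gaussian centered at seed=(row, col); peak value 1 (assumption)."""
    rows, cols = np.indices(shape)
    d2 = (rows - seed[0]) ** 2 + (cols - seed[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def radial_gradient_index(h, seed, contour):
    """Eq. (13): sum of grad(h) . r_hat over the contour / sum of |grad(h)|.

    `contour` is an (N, 2) array of integer (row, col) points that must not
    contain the seed point itself (the radial vector would have zero length).
    """
    g_row, g_col = np.gradient(np.asarray(h, dtype=float))
    pts = np.asarray(contour, dtype=int)
    grad = np.stack([g_row[pts[:, 0], pts[:, 1]],
                     g_col[pts[:, 0], pts[:, 1]]], axis=1)
    radial = pts - np.asarray(seed, dtype=float)
    radial /= np.linalg.norm(radial, axis=1, keepdims=True)
    return np.sum(grad * radial) / np.sum(np.linalg.norm(grad, axis=1))
```

With candidate contours `contours` obtained from successive thresholds of `h = f * gaussian_constraint(f.shape, seed, sigma)`, the initial contour would be `max(contours, key=lambda c: radial_gradient_index(h, seed, c))`.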

RGI represents the average proportion of the gradients in the radially outward direction.
The strategy of choosing the maximum RGI works well for benign lesions, as most have circular-like
shapes and smooth margins. However, for malignant lesions, because of irregular shapes
and spiculated margins, the resulting contours are usually under-grown. Nevertheless, RGI
provides a good initial contour for the subsequent evolution driven by the active contour model.

C. Background estimation

In the active contour model, contour evolution relies on the competition between the region
inside the contour (foreground) and that outside the contour (background). The presence of
structured noise such as lymph nodes, parenchyma, and localization markers complicates the
background in mammograms. RGI segmentation provides not only the initial contour, but
also a means to estimate the effective background surrounding the lesion. In our study, the
effective background is defined as the set of pixels within a given distance $d$ (pixels) from
the circumscribed rectangle of the initial contour, as shown in Fig. 3.

[Figure  3  about  here.] 

Distance  d  plays  an  important  role  in  determining  the  effective  background.  On  one 
hand,  a  large  d  yields  a  large  region  and  thus  better  statistics  on  the  background.  On  the 
other  hand,  a  small  d  would  not  be  contaminated  by  nearby  structures.  In  this  study,  an 
automatic  scheme  was  developed  to  determine  the  best  distance  d  from  a  series  of  candidates. 

For a series of distances $d_i$, $i = 1, \ldots, L$, two series of regions can be determined, as Fig.
4 (a) shows. One series consists of the background candidates $B_i$ (Fig. 4 (b)), and the other
consists of the net increases of background $B_{Ni}$ (Fig. 4 (c)), where $B_{Ni} = B_{i+1} - B_i$, $i = 1, \ldots, L-1$.
Our method is based on the following two principles. As the background expands,
1) the mean gray value of $B_i$, i.e. mean($B_i$), should decrease as more areas with lower gray
level are included; and 2) the standard deviation of $B_{Ni}$, i.e. std($B_{Ni}$), should not change
substantially for a relatively smooth background. By monitoring mean($B_i$) and std($B_{Ni}$) with
increasing $d_i$, two potential distance candidates are obtained. One candidate is defined as
the distance at which mean($B_i$) reaches a minimum value, and the other candidate is defined
as the distance at which std($B_{Ni}$) demonstrates the maximum increase, as shown in Fig. 5.
Finally, the distance is chosen as the minimum of these two candidates. For the
example in Fig. 4, the distance is automatically determined to be $d = 110$ pixels.
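
The selection rule described above can be sketched as follows, given precomputed statistics for each candidate distance. The text does not say which distance index a jump in std($B_{Ni}$) is attributed to; the sketch assumes it is the distance at which the increased value is first observed. The function name and argument layout are hypothetical.

```python
import numpy as np

def select_distance(distances, mean_b, std_bn):
    """Pick d as the minimum of the two candidates described in the text.

    distances : d_1..d_L (increasing)
    mean_b    : mean(B_i) for i = 1..L
    std_bn    : std(B_Ni) for i = 1..L-1
    """
    distances = np.asarray(distances)
    # Candidate 1: distance at which the background mean is minimal.
    d_mean = distances[int(np.argmin(mean_b))]
    # Candidate 2: distance at which std(B_Ni) shows its largest increase.
    # Index mapping is an assumption: the jump between std_bn[k] and
    # std_bn[k + 1] is attributed to distances[k + 1].
    jump = int(np.argmax(np.diff(std_bn))) + 1
    d_std = distances[jump]
    return min(d_mean, d_std)
```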

[Figure 4 about here.]

[Figure 5 about here.]


D.  Background  trend  correction 

Due to the non-uniformity of the background distribution, some pixels in the background
have gray values similar to those in the lesion, which hinders the segmentation performance of the
active contour model. Thus, a two-dimensional background trend correction was employed
prior to segmentation. The background trend is estimated by fitting a two-dimensional
surface with a least-squares method to the gradual change in the background pixel values
within the extracted background estimation region. Here, we used a first-order polynomial
function, i.e. $f(x, y) = a + b \cdot x + c \cdot y$, to describe the two-dimensional surface, as higher-order
polynomial functions would tend to fit the mass lesion itself. Fig. 6 demonstrates the significance
of the background trend correction when a non-uniform background is present.
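
A least-squares fit of the first-order surface $f(x, y) = a + b \cdot x + c \cdot y$ can be sketched as below. The sketch assumes that the correction consists of subtracting the fitted plane from the whole region of interest, which the text does not state explicitly; the function names and the boolean `background_mask` argument are hypothetical.

```python
import numpy as np

def fit_background_plane(image, background_mask):
    """Least-squares fit of a + b*x + c*y to the pixels selected by the mask."""
    rows, cols = np.nonzero(background_mask)
    design = np.column_stack([np.ones(rows.size), cols, rows])  # 1, x, y
    values = np.asarray(image, dtype=float)[rows, cols]
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return coef  # a, b, c

def remove_background_trend(image, background_mask):
    """Subtract the fitted plane from the image (assumed form of the correction)."""
    a, b, c = fit_background_plane(image, background_mask)
    yy, xx = np.indices(np.shape(image))
    return np.asarray(image, dtype=float) - (a + b * xx + c * yy)
```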

[Figure  6  about  here.] 

E.  Dynamic  stopping  criterion 

To stop the evolution of a contour, a pre-determined threshold is often used. Various metrics
can be used to check convergence of the evolution, such as the change of the level set function $\phi$ [30]
and the change of the contour length [31]. The contour evolution can also be terminated when
the area inside the contour differs from the initial one by a given value [32]. In our initial
study, we defined a stopping criterion of relative foreground change (RFC), which is
the ratio between the change of foreground and the area of foreground. Compared with the
stopping criterion of change of contour length used in [31], RFC has two advantages: 1) RFC
is a relative measure and thus is more suitable for lesions of various sizes; 2) RFC is more
computationally efficient, as extracting the contour in [31] requires additional computation.
Whichever strategy is used, it is necessary to set some threshold in advance. However, due
to the varying sizes of lesions as well as of the backgrounds obtained from automatic background
estimation, it is difficult to find a fixed parameter for controlling convergence.

In  our  preliminary  work  [33],  we  developed  a  dynamic  method  to  terminate  contour 
evolution  automatically.  In  that  work,  as  the  contour  evolves,  mean  values  of  both  foreground 
and  background  will  decrease  gradually.  As  foreground  is  generally  more  homogeneous  than 
the  background,  the  rate  of  foreground  mean  change  is  less  than  that  of  background  mean 
change.  However,  as  the  evolving  contour  crosses  the  lesion  margin,  the  foreground  mean 
will  decrease  faster  than  will  the  background  mean.  Thus,  during  dynamic  contouring,  the 
difference  between  the  rate  of  foreground  mean  change  and  that  of  background  mean  change 
is  tracked,  and  contour  evolution  is  terminated  when  the  decrease  of  foreground  mean  value 
is more rapid than that of the background mean value. This method provides a way to
terminate contour evolution without a pre-defined threshold. However, it neglects the influence
of the sizes of both foreground and background, and thus stops contour evolution earlier than
expected.

In order to address this problem, we modified the previous method, which we present
here in one dimension. As Fig. 7 shows, $g(x)$ is a decreasing function defined on the interval
$[0, L]$, and point $s$ is moving within $[0, L]$ at the speed of $v$. $s$ also splits $[0, L]$ into two
regions. For simplicity, the region $[0, s]$ is named region 1, and $[s, L]$ is region 2. Then, the
mean values of regions 1 and 2 are:

$$
c_1 = \frac{\int_0^s g(x)\,dx}{s}, \qquad c_2 = \frac{\int_s^L g(x)\,dx}{L - s}.
$$

[Figure 7 about here.]

The slope of $c_1$ is:

$$
\frac{dc_1}{dt} = \frac{dc_1}{ds} \cdot \frac{ds}{dt}
 = \frac{d}{ds}\left(\frac{\int_0^s g(x)\,dx}{s}\right) \cdot v
 = \frac{g(s) - c_1}{s} \cdot v.
$$

Here, we use the fact that $v = \frac{ds}{dt} \cdot \hat{v}$, where $\hat{v}$ is the outward unit vector. Similarly, the slope
of $c_2$ is:

$$
\frac{dc_2}{dt} = -\frac{g(s) - c_2}{L - s} \cdot v.
$$

Thus, the difference between these two slopes is:

$$
\Delta v = \frac{dc_1}{dt} - \frac{dc_2}{dt} = \left[\frac{g(s) - c_1}{s} + \frac{g(s) - c_2}{L - s}\right] \cdot v.
\tag{14}
$$


As discussed above, as $s$ moves within the object, we have $\Delta v > 0$. As $s$ moves across
the edge, $\Delta v$ becomes negative. When $\Delta v = 0$, we have
$g(s) = \frac{s}{L} \cdot c_2 + \frac{L-s}{L} \cdot c_1 > \frac{1}{2}(c_1 + c_2)$, as
in general $L - s > s$ and $c_1 > c_2$. However, if only the speed terms driven by the image property
in Equation (7) are considered, the evolution should stop at $s_0$ such that $g(s_0) = \frac{1}{2}(c_1 + c_2)$.
Because of the influence of the region sizes, $s$ will stop moving too early if the criterion in Equation (14)
is used.

In order to eliminate the influence of size, a weighted difference between the slope of $c_1$ and
that of $c_2$ is introduced as:

$$
\Delta v_w = \frac{s}{L - s} \cdot \frac{dc_1}{dt} - \frac{dc_2}{dt}
 = \frac{1}{L - s}\left[2 \cdot g(s) - (c_1 + c_2)\right] \cdot v.
\tag{15}
$$

It can be shown that $\Delta v_w$ goes to zero at the desired contour position $s_0$, where $g(s_0) = \frac{1}{2}(c_1 + c_2)$.

The one-dimensional case described above can be extended to the two-dimensional case.
During the contour evolution, the weighted difference between the mean slope of the foreground
and that of the background is monitored, and the contour evolution is terminated when the
weighted slope difference converges to zero.
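
The one-dimensional behavior of Equations (14) and (15) can be checked numerically with a short sketch. It assumes a sampled decreasing profile `g` on a uniform grid `x` over $[0, L]$, unit speed, and discrete means in place of the integrals; the function names and the sigmoid test profile mentioned below are illustrative assumptions, not part of the original study.

```python
import numpy as np

def slope_differences(g, x, k, v=1.0):
    """Return (dv, dv_w, c1, c2) of Eqs. (14)-(15) with the split point s = x[k]."""
    L, s = x[-1], x[k]
    c1 = np.mean(g[: k + 1])  # discrete mean of g over [0, s]
    c2 = np.mean(g[k:])       # discrete mean of g over [s, L]
    dc1 = (g[k] - c1) / s * v
    dc2 = -(g[k] - c2) / (L - s) * v
    dv = dc1 - dc2                    # unweighted difference, Eq. (14)
    dv_w = s / (L - s) * dc1 - dc2    # weighted difference, Eq. (15)
    return dv, dv_w, c1, c2

def stopping_index(g, x, weighted=True):
    """First interior index at which the chosen slope difference is <= 0."""
    for k in range(1, len(x) - 1):
        dv, dv_w, _, _ = slope_differences(g, x, k)
        if (dv_w if weighted else dv) <= 0.0:
            return k
    return len(x) - 2
```

For a smooth edge profile such as `g = 1 / (1 + np.exp((x - 30) / 2))` on `x = np.linspace(0, 100, 1001)`, the unweighted criterion stops before the weighted one, and the weighted criterion stops where `g` is close to the average of `c1` and `c2`, in line with the argument above.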

F.  Implementation 

In order to calibrate parameters in the proposed segmentation method, ten digitized screen-film
mammograms (SFM) with a spatial resolution of 100 $\mu$m $\times$ 100 $\mu$m were analyzed. The
calibrated segmentation method was then applied to the entire FFDM database for independent
performance evaluation.

In our study, we set both $\lambda_1$ and $\lambda_2$ in Equation (7) to one (i.e. $\lambda_1 = \lambda_2 = 1$), since
the homogeneities inside and outside the contour should be weighted equally.
Other parameters in Equation (7) were chosen as follows: $\epsilon = 1$ and $\Delta t = 0.1$,
where $\epsilon$ influences the Heaviside function and $\Delta t$ controls how quickly the level set function
changes. Note that $\mu$ controls the smoothness of the final contour. If one wants
to depict the fine details of the object, one should choose a small $\mu$. Conversely, if one
wants to obtain a smoother contour, one should set a large $\mu$. As some of our computer-extracted
features, such as spiculation, characterize the fine details of the lesion margin, we
chose a fairly small value of $\mu$, i.e. $0.001 \times 1023^2$, which also allows for the use of the 10-bit
data. To ensure numerical stability, the coefficient $\nu$ must satisfy $\nu \cdot \Delta t < \frac{1}{4}$ [28], so we set
$\nu = 2$ in our study. The maximum number of iterations is set to 500.
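
The parameter choices above can be collected in a small configuration object, with the stability condition $\nu \cdot \Delta t < \frac{1}{4}$ checked explicitly. This is only a convenience sketch; the class and field names are hypothetical, while the default values are those reported in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SegmentationParams:
    lambda1: float = 1.0           # weight of the inside-contour fitting term
    lambda2: float = 1.0           # weight of the outside-contour fitting term
    epsilon: float = 1.0           # width of the regularized Heaviside function
    dt: float = 0.1                # time step of the level set update
    mu: float = 0.001 * 1023 ** 2  # contour-length weight for 10-bit data
    nu: float = 2.0                # weight of the distance-regularizing term
    max_iterations: int = 500

    def is_stable(self) -> bool:
        """Stability condition nu * dt < 1/4 quoted from [28]."""
        return self.nu * self.dt < 0.25
```

With the reported values, $\nu \cdot \Delta t = 0.2$, which satisfies the condition.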

G.  Performance  evaluation 

The performance of the proposed segmentation algorithm was assessed by comparing the
computer-delineated contours with the outlines drawn by an expert breast radiologist. Besides
visually evaluating the agreement of the computer-segmented results with the radiologist's
manually-contoured lesion margins, a quantitative measure was used to evaluate the segmentation
performance. For a particular lesion, the area overlap ratio (AOR) between manual
segmentation and computer segmentation is defined as:

$$
\mathrm{AOR} = \frac{\mathrm{Area}(M \cap C)}{\mathrm{Area}(M \cup C)}
$$

where $M$ is the region enclosed by the manually-segmented contour and $C$ is the region enclosed
by the computer-segmented contour. AOR ranges from zero to one, being zero in the case of no
overlap and one in the case of a perfect match. For the entire database, a series of AOR thresholds
was used, and at each AOR threshold, the percentage of lesions "correctly" segmented was calculated by
counting the number of lesions with AOR greater than that threshold.
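
The AOR and the fraction of "correctly" segmented lesions at a set of thresholds can be computed as in the sketch below, assuming the two segmentations are available as boolean masks of equal shape. The function names are hypothetical, and returning zero for two empty masks is a convention chosen here.

```python
import numpy as np

def area_overlap_ratio(manual_mask, computer_mask):
    """AOR = Area(M intersect C) / Area(M union C) for two boolean masks."""
    m = np.asarray(manual_mask, dtype=bool)
    c = np.asarray(computer_mask, dtype=bool)
    union = np.logical_or(m, c).sum()
    if union == 0:
        return 0.0  # both masks empty: treated as no overlap (a convention)
    return np.logical_and(m, c).sum() / union

def fraction_correct(aor_values, thresholds):
    """Fraction of lesions whose AOR exceeds each threshold."""
    aor = np.asarray(aor_values, dtype=float)
    return np.array([np.mean(aor > t) for t in thresholds])
```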

IV.  RESULTS 

A.  Evaluation  of  level  set  smoothness 

In our study, a new term $E_s = \int_\Omega \frac{1}{2}\left(1 - \|\nabla\phi\|\right)^2 dx\,dy$ is added to the original active contour
model in [25]; we therefore first evaluated the usefulness of this term. Two sets of final contours
were extracted from the entire FFDM database, one obtained with $E_s$ and the other
without. The results show that $E_s$ not only provides a smoother contour, but also
pushes the contour closer to the lesion margin in fewer iterations: the mean number of
iterations was 160 with $E_s$, compared with 327 without $E_s$. In the example
shown in Fig. 8, the left figure shows the segmentation result without smoothing the level set
function, which took 500 iterations, whereas the result with smoothing of the level set function,
shown in the right figure, took only 248 iterations to converge.

[Figure  8  about  here.] 

B.  Evaluation  of  dynamic  stopping  criterion 

We investigated our new stopping criterion based on the weighted slope difference between
foreground mean and background mean ($\Delta v_w$), and compared it with the unweighted slope
difference method ($\Delta v$) as well as the relative foreground change (RFC). The RFC thresholds used to
terminate contour evolution were set to 0.05 and 0.01, respectively. During the evolution, we
recorded the contours obtained with these four stopping criteria and computed their AOR with the
radiologist's outlines.

Fig. 9 shows plots of the fraction of correctly segmented lesions at various AOR thresholds
for the four stopping criteria ($\Delta v_w$, $\Delta v$, RFC$_{0.05}$ and RFC$_{0.01}$) on the FFDM database.
For benign images, all the criteria yielded similar segmentation performance, since the initial
contours, obtained by RGI segmentation, are close to the true lesion margins. However, as
RGI segmentation is inferior for malignant lesions, $\Delta v_w$ performs better than the other
stopping criteria on those lesions.

[Figure  9  about  here.] 

Table 1 summarizes the statistical comparison (Holm t test) [34] among these four criteria,
given the mean and standard deviation of AOR for each criterion. In terms of area overlap
ratio (AOR), the weighted slope difference method is statistically better than the unweighted
slope difference method and the convergence rate at RFC = 0.05 (overall significance level
$\alpha_T = 0.05$). However, we failed to show a statistically significant difference between the
weighted slope difference method and the convergence rate at RFC = 0.01. Nevertheless, if
the number of iterations is taken into account, the mean number of iterations for the weighted
slope difference is 156, while it is 280 for RFC$_{0.01}$. The weighted slope difference is thus more
efficient than RFC$_{0.01}$.

[Table  1  about  here.] 

C.  Comparative  evaluation  of  the  segmentation  method 

The segmentation algorithm was compared with our previously-reported region-growing [6]
and RGI-based segmentation [7] methods. Fig. 10 shows several examples of lesion segmentations
using these three segmentation methods. The result of the proposed method visually
demonstrates a better agreement with the radiologist's outline of the lesion.

[Figure 10 about here.]


Fig. 11 shows the fraction of lesions correctly segmented at various overlap threshold
levels. At the overlap threshold of 0.4, for benign lesions, 87% of the images are correctly
segmented with the proposed method, while 72% and 81% of the images are correctly segmented
by the region-growing and RGI-based methods, respectively. For malignant lesions,
84% of the images are correctly segmented with the proposed method, while 66% and 67% of
the images are correctly segmented by the region-growing and RGI-based methods, respectively.

[Figure  11  about  here.] 

Table 2 gives the statistical comparison (Holm t test) [34] for AOR means from the three
segmentation methods. The improvement of AOR with the proposed method was found to
be statistically significant (overall significance level $\alpha_T = 0.05$).

[Table  2  about  here.] 


V.  DISCUSSION 

We developed a dual-stage segmentation method to efficiently segment mass lesions from
the parenchymal surround in FFDM images. Our proposed method includes a geometric
active contour model, which includes analysis of homogeneities both inside and outside of
the evolving contour. The application of RGI-based segmentation to provide an initial contour
not only improves the computational efficiency, but also provides a method with which to
estimate the effective background about the lesion and to suppress unrelated pixel values.
Also, our automatic stopping criterion is lesion-specific, and does not rely on a fixed number
of iterations.

As the results show, the term $E_s$ in the active contour model plays an important role in
effective and efficient segmentation. When $\|\nabla\phi\| > 1$, $\mathrm{div}[(1 - \frac{1}{\|\nabla\phi\|})\nabla\phi]$ evolves the level set
function $\phi$ towards reducing $\|\nabla\phi\|$, thus smoothing $\phi$. The larger the gradient magnitude
of the level set function, the more it is smoothed. When $\|\nabla\phi\| < 1$, $\mathrm{div}[(1 - \frac{1}{\|\nabla\phi\|})\nabla\phi]$
evolves the level set function towards increasing $\|\nabla\phi\|$, so as to maintain the gradient of the
level set function at some level. This mechanism ensures that the level set function, and thus the
final contour, is relatively smooth. Meanwhile, as $\|\nabla\phi\|$ is restricted in magnitude, the
foreground has the potential to grow faster.

It should be noted that the weighted slope difference $\Delta v_w$ is always non-negative as
long as $g(x)$ is a decreasing function. In the active contour model, if only the speed term
driven by the image property is considered, the speed of the contour can be simplified as:

$$
\begin{aligned}
v &= \left[(g(s) - c_2)^2 - (g(s) - c_1)^2\right] \cdot \hat{v} \\
  &= (c_1 - c_2) \cdot \left[2 \cdot g(s) - (c_1 + c_2)\right] \cdot \hat{v}
\end{aligned}
$$

where $\hat{v}$ is the outward unit vector. Inserting $v$ into Equation (15), we have:

$$
\Delta v_w = \frac{1}{L - s}(c_1 - c_2) \cdot \left[2 \cdot g(s) - (c_1 + c_2)\right]^2 \cdot \hat{v} \geq 0.
$$

If $v$ is driven by another image property, such as edge information, this relationship still holds.
When $g(s) > \frac{1}{2}(c_1 + c_2)$, i.e. $s$ is within the object, the contour moves outward to the
edge, and thus we have $\Delta v_w > 0$. If $g(s) < \frac{1}{2}(c_1 + c_2)$, i.e. $s$ is outside the object, the contour moves
inward to the edge, and we again have $\Delta v_w > 0$. The weighted slope difference therefore also provides
a general mechanism for terminating contour evolution with other active contour models.

In this study, we empirically compared the segmentation performance of the proposed
method with our previously-reported region-growing [6] and RGI-based [7] segmentation methods.
However, we could not perform empirical comparisons between our method
and those reviewed in the introduction section, as we do not have the code for those methods.
Timp's method [10] uses polar coordinates and restricts the mass size to a certain range;
thus, one would expect their method to work better for lesions with circular-like margins.
However, for lesions with irregular shapes or very large sizes, their method may have difficulty.
Our dual-stage segmentation method is able to handle this situation by further
evolving the contour via the active contour model. Both of the fuzzy-set-based methods developed
by Guliato et al. [11] require preset thresholds, such as the gray-level
threshold in the first method and the maximum allowed difference between the value of the
pixel being analyzed and the mean of the sub-region in the second method, which prevents
these methods from being applied to a large database. Their two thresholds were manually
selected case by case in their evaluation using a database of 47 mammograms. In contrast,
our method is flexible in that no threshold needs to be set in advance.

In our preliminary study [35], we compared two radiologists' outlines on a digitized
screen-film mammogram (SFM) database, which consisted of 29 benign (51 mammograms)
and 55 malignant (96 mammograms) lesions. At an overlap threshold of 0.4, 96.6% of
lesion images were correctly segmented by one radiologist in comparison with the other.
This result indicates that the radiologists highly agreed on the lesion margins for SFM. We
would expect the radiologists to also agree on the lesion margins for FFDM, as the
manufacturer has pre-processed the FFDM images to make them appear to radiologists like
traditional SFM mammograms.

When we developed the proposed segmentation algorithm, the FFDM database was still being
constructed, so our method was initially calibrated and tested with the SFM database [33].
After building the FFDM database, we randomly picked three groups of FFDM images, each
of which consisted of five benign and five malignant images, and evaluated the segmentation
performance using the proposed method calibrated with SFM images. The results were
similar to those we had obtained with SFM images. Thus, we believe that the parameters
obtained with SFM images also work with FFDM images, which was subsequently validated by the
independent evaluation with the entire FFDM database.

Our results could be partially explained by the pre-processing of FFDM images, which
is performed by the manufacturers. After pre-processing, the gray-level range and contrast
of FFDM images become similar to those of SFM images, which makes it possible to
apply parameters from SFM images to FFDM images, as gray-level range and contrast
are two key components used in our proposed lesion segmentation method. Our results also
show the robustness of the proposed method, as it mainly uses the global information of
images.


VI.  CONCLUSION 

In this paper, we present a new lesion segmentation method based on a geometric active
contour model, which includes an initial RGI segmentation, background estimation, background
trend correction, and a dynamic stopping criterion. Evaluation with a large number
of FFDM images has shown that the proposed method is statistically superior to our previous
region-growing and RGI-based algorithms in terms of overlap ratios obtained in comparison
with an expert's manual outlines. At an overlap threshold of 0.4, 85% of the images are
correctly segmented by the proposed method, while only 69% and 73% of the images are
correctly segmented by our previous region-growing and RGI-based methods, respectively.

APPENDIX


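In this part, we provide the details of the derivation from the energy functional (4) to the associated
Euler-Lagrange equation (5). For convenience, we restate Equation (4) here as:

$$
\begin{aligned}
E_\epsilon(c_1, c_2, \phi) = \int_\Omega \Big[ &\mu \cdot \delta_\epsilon(\phi(x,y))\,\|\nabla\phi(x,y)\| \\
 &+ \frac{\nu}{2}\left(1 - \|\nabla\phi(x,y)\|\right)^2 \\
 &+ \lambda_1 \cdot |f_0(x,y) - c_1|^2\,H_\epsilon(\phi(x,y)) \\
 &+ \lambda_2 \cdot |f_0(x,y) - c_2|^2\,\left(1 - H_\epsilon(\phi(x,y))\right) \Big]\,dx\,dy.
\end{aligned}
\tag{17}
$$

We define $F(\phi, \nabla\phi, x, y)$ as:

$$
\begin{aligned}
F(\phi, \nabla\phi, x, y) = {} &\mu\,\delta_\epsilon(\phi)\,\|\nabla\phi\| + \frac{\nu}{2}\left(1 - \|\nabla\phi\|\right)^2 \\
 &+ \lambda_1 |f_0 - c_1|^2 H_\epsilon(\phi) + \lambda_2 |f_0 - c_2|^2 \left(1 - H_\epsilon(\phi)\right).
\end{aligned}
\tag{18}
$$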

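For simplicity, we have omitted the independent variables $(x, y)$ of $\phi$ and $f_0$. According
to the calculus of variations, the scalar function $\phi(x, y)$ that minimizes $E_\epsilon(c_1, c_2, \phi)$ solves the
PDE:

$$
\frac{\partial}{\partial x}\left(\frac{\partial F}{\partial \phi_x}\right)
 + \frac{\partial}{\partial y}\left(\frac{\partial F}{\partial \phi_y}\right)
 - \frac{\partial F}{\partial \phi} = 0.
\tag{19}
$$

Taking the partial derivative of $F$ with respect to $\phi_x$, $\phi_y$ and $\phi$, respectively, we have:

$$
\begin{aligned}
\frac{\partial F}{\partial \phi_x} &= \mu\,\delta_\epsilon(\phi)\frac{\phi_x}{\|\nabla\phi\|} + \nu\left(\phi_x - \frac{\phi_x}{\|\nabla\phi\|}\right) \\
\frac{\partial F}{\partial \phi_y} &= \mu\,\delta_\epsilon(\phi)\frac{\phi_y}{\|\nabla\phi\|} + \nu\left(\phi_y - \frac{\phi_y}{\|\nabla\phi\|}\right) \\
\frac{\partial F}{\partial \phi} &= \mu\,\|\nabla\phi\|\,\delta'_\epsilon(\phi) + \left[\lambda_1 (f_0 - c_1)^2 - \lambda_2 (f_0 - c_2)^2\right]\delta_\epsilon(\phi)
\end{aligned}
\tag{20}
$$

where $\delta'_\epsilon(\phi) = \frac{d\,\delta_\epsilon(\phi)}{d\phi}$ and we use the relation $\|\nabla\phi\| = \sqrt{\phi_x^2 + \phi_y^2}$.

The partial derivative of $\frac{\partial F}{\partial \phi_x}$ with respect to $x$ is:

$$
\frac{\partial}{\partial x}\left(\frac{\partial F}{\partial \phi_x}\right)
 = \mu\,\delta'_\epsilon(\phi)\frac{\phi_x^2}{\|\nabla\phi\|}
 + \mu\,\delta_\epsilon(\phi)\frac{\partial}{\partial x}\left(\frac{\phi_x}{\|\nabla\phi\|}\right)
 + \nu\,\frac{\partial}{\partial x}\left(\phi_x - \frac{\phi_x}{\|\nabla\phi\|}\right).
\tag{21}
$$

Similarly, we have:

$$
\frac{\partial}{\partial y}\left(\frac{\partial F}{\partial \phi_y}\right)
 = \mu\,\delta'_\epsilon(\phi)\frac{\phi_y^2}{\|\nabla\phi\|}
 + \mu\,\delta_\epsilon(\phi)\frac{\partial}{\partial y}\left(\frac{\phi_y}{\|\nabla\phi\|}\right)
 + \nu\,\frac{\partial}{\partial y}\left(\phi_y - \frac{\phi_y}{\|\nabla\phi\|}\right).
\tag{22}
$$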


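Inserting (20) - (22) back into (19), we obtain:

$$
\begin{aligned}
0 = {} &\mu\,\delta'_\epsilon(\phi)\left[\frac{\phi_x^2 + \phi_y^2}{\|\nabla\phi\|} - \|\nabla\phi\|\right] \\
 &+ \mu\,\delta_\epsilon(\phi)\left[\frac{\partial}{\partial x}\left(\frac{\phi_x}{\|\nabla\phi\|}\right) + \frac{\partial}{\partial y}\left(\frac{\phi_y}{\|\nabla\phi\|}\right)\right] \\
 &+ \nu\left\{\frac{\partial}{\partial x}\left[\phi_x - \frac{\phi_x}{\|\nabla\phi\|}\right] + \frac{\partial}{\partial y}\left[\phi_y - \frac{\phi_y}{\|\nabla\phi\|}\right]\right\} \\
 &- \delta_\epsilon(\phi)\left[\lambda_1 (f_0 - c_1)^2 - \lambda_2 (f_0 - c_2)^2\right].
\end{aligned}
\tag{23}
$$

By noticing that the first bracket vanishes, since $\phi_x^2 + \phi_y^2 = \|\nabla\phi\|^2$, and that:

$$
\frac{\partial}{\partial x}\left[\phi_x - \frac{\phi_x}{\|\nabla\phi\|}\right] + \frac{\partial}{\partial y}\left[\phi_y - \frac{\phi_y}{\|\nabla\phi\|}\right]
 = \mathrm{div}\left[\left(1 - \frac{1}{\|\nabla\phi\|}\right)\nabla\phi\right]
$$

we finally obtain the compact form of (23) as:
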
0  =  5e(0)[//  •  div{  ^  )  -  Ai(/0  -  ci)2  +  A2(/0  -  c2)2]  +  v  ■  div[(  1  -  ^  ,,)V0].  (24) 
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Figure  2:  Schematic  diagram  of  the  proposed  dual-stage  lesion  segmentation  algorithm. 
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Figure  3:  Illustration  of  defining  the  effective  background.  In  this  figure,  the  solid  line 
represents  the  initial  contour  obtained  by  RGI  segmentation  and  the  dash-dotted  rectangle 
is  the  circumscribed  rectangle  of  this  initial  contour.  The  effective  background  is  defined  as 
the  region  inside  the  dashed  rectangle  excluding  the  region  within  the  initial  contour.  An 
automatic  scheme  is  employed  to  determine  the  best  d. 
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(a)  (c) 

Figure  4:  The  illustration  of  determining  the  distance  d.  (a)  a  mammogram  with  a  series  of 
distances  dt.  in  which  the  thick  dashed  rectangle  represents  the  computer-selected  distance  d 
;  (b)  Bi\  the  ith  background  candidate  corresponding  to  di  ;  (c)  Bjyt:  the  ith  net  background 
increase.  Background  is  defined  as  the  set  of  pixels  within  a  given  distance  di  (pixel)  from 
the  circumscribed  rectangle  of  the  initial  contour. 
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Figure  5:  Left:  the  trend  of  mean  value  of  Bi}  the  ith  background  candidate  with  respect  to 
distance  d*.  Right:  the  trend  of  standard  deviation  of  Bm,  the  ith  net  background  increase 
with  respect  to  B,  and  Bi+ 
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(C)  (d) 

Figure  6:  An  example  of  the  effect  of  background  trend  correction  on  segmentation,  (a)  the 
original  ROI  ;  (b)  segmentation  result  of  (a);  (c)  the  processed  ROI  after  background  trend 
correction;  and  (d)  segmentation  result  of  (c). 
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Figure  7:  The  illustration  of  determining  the  stopping  point.  g{x)  is  a  decreasing  function 
defined  on  [0,  L]  and  s  6  [0,  L\  is  a  moving  point  with  speed  of  v. 
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Figure 8: An example of the effect of level set smoothness on the final segmentation results.
Left: segmentation without level set smoothness; Right: segmentation with level set
smoothness.
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Figure 9: Segmentation performance of four different stopping criteria in terms of area
overlap ratio (AOR) on a clinical FFDM database. In both plots, Δv_w is the weighted slope
difference between the foreground mean value and the background mean value, in which the
foreground is the area within the evolving contour and the background is the area outside the
contour; Δv is the unweighted slope difference between these two mean values. RFC_0.01 stands
for a stopping criterion that terminates contour evolution when the relative foreground change
(RFC) is not greater than 0.01. Similarly, RFC_0.05 stops the contour evolution when the RFC is
not greater than 0.05. Left: evaluated on 327 benign images; Right: evaluated on 412
malignant images. The results show that the weighted slope difference is statistically superior
to the unweighted slope difference and to the RFC = 0.05 convergence criterion on malignant images.



Figure 10: Segmentation results for 5 malignant lesion examples: (a) radiologist’s outline, (b)
region-growing, (c) RGI-based segmentation and (d) the proposed dual-stage segmentation
method.
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Figure  11:  Performance  of  three  different  segmentation  methods  in  terms  of  area  overlap 
ratio  (AOR)  on  a  clinical  FFDM  database.  Left:  evaluated  on  327  benign  images;  Right: 
evaluated  on  412  malignant  images.  The  results  show  that  the  dual-stage  segmentation 
method is statistically superior to both the region-growing and RGI-based methods.
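
For readers who want to reproduce the performance measure, the following is a minimal sketch of an area overlap ratio. It assumes the common intersection-over-union definition between the computer segmentation and the radiologist's outline, which is not spelled out in this caption; the function name and the pixel-set representation are illustrative only.

```python
def area_overlap_ratio(computer, reference):
    """Intersection-over-union of two segmentations given as sets of (row, col) pixels.

    Assumption: AOR = |computer AND reference| / |computer OR reference|.
    """
    union = computer | reference
    if not union:
        raise ValueError("both segmentations are empty")
    return len(computer & reference) / len(union)


# Two 4x4 squares that share half of their area.
computer = {(r, c) for r in range(4) for c in range(0, 4)}
reference = {(r, c) for r in range(4) for c in range(2, 6)}
ratio = area_overlap_ratio(computer, reference)  # 8 shared pixels out of 24
```

Under this definition a value of 1 means perfect agreement with the reference outline and 0 means no overlap at all.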
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Table I: Statistical comparison of the performance of four stopping criteria in the dual-stage
segmentation in terms of average area overlap ratio (AOR). The p-values are given for
the comparison of the weighted slope difference with each other stopping criterion. The
significance level α_i for each individual paired t test is calculated using Holm’s procedure
(overall α_T = 0.05). Same conventions as Fig. 9.

                              Δv_w          Δv            RFC_0.01      RFC_0.05
Benign      mean ± std        0.61 ± 0.19   0.61 ± 0.19   0.61 ± 0.19   0.61 ± 0.19
            p-value           -             0.856         0.801         0.601
            sig. lev. (α_i)   -             -             -             -
Malignant   mean ± std        0.59 ± 0.19   0.53 ± 0.20   0.57 ± 0.19   0.52 ± 0.20
            p-value           -             < 0.001       0.192         < 0.001
            sig. lev. (α_i)   -             0.05          -             0.025
All         mean ± std        0.60 ± 0.19   0.57 ± 0.20   0.59 ± 0.19   0.56 ± 0.20
            p-value           -             0.002         0.25          < 0.001
            sig. lev. (α_i)   -             0.05          -             0.025
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Table II: Statistical comparison of the three lesion segmentation algorithms. Performance
is given by average area overlap ratio, and p-values are given for the comparison of the
dual-stage segmentation with the previous region-growing and RGI-based methods. The
significance level α_i for each individual paired t test is calculated using Holm’s procedure
(overall α_T = 0.05).

                              Dual-stage segmentation   RGI           Region-growing
Benign      mean ± std        0.61 ± 0.19               0.58 ± 0.19   0.51 ± 0.20
            p-value           -                         0.01          < 0.001
            sig. lev. (α_i)   -                         0.05          0.025
Malignant   mean ± std        0.59 ± 0.19               0.48 ± 0.20   0.49 ± 0.20
            p-value           -                         < 0.001       < 0.001
            sig. lev. (α_i)   -                         0.025         0.05
All         mean ± std        0.60 ± 0.19               0.52 ± 0.20   0.50 ± 0.20
            p-value           -                         < 0.001       < 0.001
            sig. lev. (α_i)   -                         0.05          0.025
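
The individual significance levels in Table II are consistent with the standard Holm step-down procedure for two comparisons. The sketch below shows how such levels arise. It is an illustration rather than the analysis code used for the report: the function name is hypothetical, and the value 0.0005 is only a stand-in for a p-value reported as "< 0.001".

```python
def holm(p_values, alpha_total=0.05):
    """Holm's step-down procedure.

    Returns one (individual_level, rejected) tuple per input p-value, in input order.
    The k-th smallest p-value (k = 0, 1, ...) is tested at alpha_total / (m - k),
    and testing stops at the first comparison that is not significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    results = [None] * m
    still_rejecting = True
    for rank, k in enumerate(order):
        level = alpha_total / (m - rank)
        still_rejecting = still_rejecting and p_values[k] <= level
        results[k] = (level, still_rejecting)
    return results


# Benign row of Table II: dual-stage vs. RGI (p = 0.01) and
# dual-stage vs. region-growing (p < 0.001, stand-in value 0.0005).
benign = holm([0.01, 0.0005])
```

With two comparisons the smaller p-value is tested at 0.025 and the larger at 0.05, which reproduces the pattern of levels listed in the benign row.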



Appendix  B:  RSNA  2006 


TITLE: 

Feature correlation on multiple-view FFDM images

AUTHORS: 

Yading  Yuan,  Maryellen  L.  Giger,  Hui  Li  and  Charlene  Sennett 
PURPOSE:  (473/2200) 

The objective of this study is to evaluate the correlation performance of individual
computerized features extracted from the full-field digital mammograms (FFDM) of a
lesion obtained in two different views. This research provides a guide for discriminating
between corresponding and non-corresponding lesion pairs within the CAD framework. It is
also helpful for guiding the development of new features to improve the accuracy of image
matching in disease diagnosis and prognosis.


METHOD  AND  MATERIALS:  (1234/2200) 

One dataset (A) includes 103 biopsy-proven cases (48 benign solid lesions and 55
malignant lesions), each of which has a craniocaudal (CC) and a mediolateral (ML) view.
Another dataset (B) includes 52 cases (24 benign solid lesions and 28 malignant lesions),
each of which has a CC and a mediolateral oblique (MLO) view. In order to evaluate the
robustness of the correlation performance to lesion segmentation, three automatic
segmentation methods were employed, in addition to the radiologist’s outlines, to extract
the mass lesion from the surrounding tissues. The conventional region-growing method
uses abrupt changes in size and circularity as the rules of segmentation. The radial
gradient index (RGI) based method applies an RGI model to the suspicious lesion multiplied
by a constraint function. The region-based active contour model evolves the contour
based on the homogeneities both inside and outside of the evolving contour. Fifteen
computer-extracted features of each lesion were calculated in both views in order to
quantify the characteristics of margin, shape, contrast and texture of the lesion. For each
feature, the correlation coefficient between the two views and the p-value of the derived
correlation coefficient were obtained.
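
A minimal sketch of the per-feature correlation step is given below. It assumes the Pearson coefficient, since the abstract does not name the coefficient used. The function names are illustrative, and the feature values in the example are made up for demonstration and are not data from the study. The p-value would follow from the t statistic with n - 2 degrees of freedom; that lookup is omitted to keep the example free of external libraries.

```python
import math


def pearson_r(view_a, view_b):
    """Pearson correlation of one feature measured on the same lesions in two views."""
    n = len(view_a)
    if n != len(view_b) or n < 3:
        raise ValueError("need two equally long lists with at least 3 lesions")
    mean_a = sum(view_a) / n
    mean_b = sum(view_b) / n
    sab = sum((a - mean_a) * (b - mean_b) for a, b in zip(view_a, view_b))
    saa = sum((a - mean_a) ** 2 for a in view_a)
    sbb = sum((b - mean_b) ** 2 for b in view_b)
    if saa == 0 or sbb == 0:
        raise ValueError("feature is constant in one view")
    return sab / math.sqrt(saa * sbb)


def t_statistic(r, n):
    """t statistic for testing r = 0, to be compared with a t distribution (n - 2 d.o.f.)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Made-up feature values for five lesions seen in two views.
cc_view = [1.0, 2.0, 3.0, 4.0, 5.0]
ml_view = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(cc_view, ml_view)
t = t_statistic(r, len(cc_view))
```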


RESULTS:  (672/2200) 

With the human outline, the feature characterizing the diameter of the lesion yielded a
correlation coefficient of 0.87 for dataset A and 0.88 for dataset B, both of which have
p-values far less than 0.05. The features characterizing shape, contrast and texture showed
better performance among the 15 individual features regardless of segmentation method,
pathology and type of view pair. This is because features representing large-scale
information are less sensitive to changes in position than those representing small-scale
information, which results in higher correlation between views for large-scale features
than for small-scale features.

CONCLUSIONS:  (301/2200) 

Our investigation indicates that the features that characterize the large-scale information
of a lesion have higher correlation between the two view images. We are currently
applying these features to develop an automated image-matching method to distinguish
corresponding from non-corresponding lesion pairs.


Appendix  C:  SPIE  Medical  Imaging  2008  (Submitted) 


TITLE: 

Correlative  feature  analysis  of  FFDM  images 
AUTHORS: 

Yading  Yuan,  Maryellen  L.  Giger,  Hui  Li  and  Charlene  Sennett 

KEYWORDS: 

Mammography,  correlative  feature  analysis,  computer-aided  diagnosis 

ABSTRACT: 

Identifying the corresponding image pair of a lesion is an essential step for combining information from
different views of the lesion to improve the diagnostic ability of both radiologists and CAD systems. Because
of the non-rigidity of the breast and the 2D projective property of mammograms, this task is not trivial. In
this study, we present a computerized framework that differentiates corresponding images of a lesion in
different views from non-corresponding ones. A dual-stage segmentation method, which employs an initial
radial gradient index (RGI) based segmentation and an active contour model, was first applied to extract
mass lesions from the surrounding tissues. Then various lesion features were automatically extracted from
each of the two views of each lesion to quantify the characteristics of margin, shape, size, texture and context
of the lesion, as well as its distance to the nipple. We employed a two-step method to select an effective subset
of features, and combined it with a Bayesian artificial neural network (BANN) to obtain a discriminant score,
which yielded an estimate of the probability that the two images are of the same physical lesion. ROC analysis
was used to evaluate the performance of the individual features and the selected feature subset in the task of
distinguishing corresponding pairs from non-corresponding pairs. Using an FFDM database with 124
corresponding image pairs and 35 non-corresponding pairs, the distance feature yielded an AUC (area under
the ROC curve) of 0.80 with leave-one-out evaluation, and the feature subset, which includes the distance
feature, lesion size and lesion contrast, yielded an AUC of 0.86.

DESCRIPTION OF PURPOSE:

Merging information from different views of a lesion has been widely recognized to allow radiologists to
better detect and evaluate breast abnormalities in FFDM images. However, because a mammogram is a 2D
projection of the 3D distribution of attenuation coefficients and the breast is a non-rigid object,
conventional image-registration techniques are not appropriate. In this study, we propose a
computerized scheme, which relies on computer-extracted features instead of the original images, to determine
whether an image pair from different views represents the same lesion.

METHOD(S): 

A dual-stage segmentation method was first applied to extract lesions from the surrounding tissues. This
algorithm utilizes a geometric active contour model that maximizes an energy function based on the
homogeneities inside and outside of the evolving contour. Prior to the application of the active contour model,
an RGI-based method is applied to yield an initial contour close to the lesion boundary in a
computationally efficient manner.

Three groups of computer-extracted lesion features were used in our study. The first group includes features
characterizing spiculation, margin, shape and contrast of a lesion, which are widely used for the task of
distinguishing between malignant and benign lesions. The second group includes texture features extracted
from various regions: the lesion, the surrounding neighborhood of the lesion, and the entire ROI.
For each region, a 2D gray-level co-occurrence matrix (GLCM) was constructed, and texture
features were extracted to quantify the spatial dependence of gray-level values. We developed an automatic
neighborhood estimation method to determine the effective surrounding neighborhood of the lesion. The third
group includes a distance feature calculated as the Euclidean distance from the nipple location to the center of
the lesion. A nipple searching method was developed to identify the nipple location automatically.
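
To illustrate the texture step, the sketch below builds a gray-level co-occurrence matrix for one pixel offset and derives a single texture measure (contrast) from it. It makes several simplifying assumptions that are not taken from the abstract: the region is a small rectangular array of already quantized gray levels, only one offset is used, the matrix is made symmetric, and contrast stands in for whichever texture features were actually computed. Function names are illustrative.

```python
def glcm(region, levels, offset=(0, 1)):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset.

    region: list of rows of integer gray levels in the range [0, levels).
    offset: (row step, column step) between the two pixels of a pair.
    """
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(region), len(region[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = region[r][c], region[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("region too small for this offset")
    return [[value / total for value in row] for row in counts]


def contrast(p):
    """GLCM contrast: expected squared gray-level difference of a pixel pair."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


region = [[0, 0, 1],
          [0, 1, 1]]
matrix = glcm(region, levels=2)
texture_contrast = contrast(matrix)
```

For an irregular lesion or neighborhood region, pixel pairs would additionally be restricted to those lying inside the region mask, which this sketch leaves out.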

A two-step method was employed for feature selection. First, a classifier was applied to each single feature
pair from different views, yielding a “correspondence” feature that represents the probability that the pair
is a corresponding pair. Then, a linear stepwise feature selection method was used to select the effective
subset of these correspondence features.

We used the BANN as our classifier, which incorporates Bayesian inference to avoid the problem of
“overfitting”. Receiver operating characteristic (ROC) analysis was used to assess the performance of the
individual features and the selected feature subset in the task of distinguishing corresponding pairs from
non-corresponding pairs.
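
As an illustration of the evaluation measure, the sketch below computes an empirical area under the ROC curve from classifier scores using the Wilcoxon (Mann-Whitney) formulation. This is an assumption made for the example: the abstract does not state how the ROC curves were fitted, and a fitted (for instance binormal) ROC model would generally give somewhat different values. The scores are made up.

```python
def empirical_auc(corresponding_scores, non_corresponding_scores):
    """Fraction of (corresponding, non-corresponding) score pairs ranked correctly.

    Ties count as half. This equals the area under the empirical ROC curve.
    """
    if not corresponding_scores or not non_corresponding_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for pos in corresponding_scores:
        for neg in non_corresponding_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(corresponding_scores) * len(non_corresponding_scores))


# Made-up discriminant scores for three corresponding and two non-corresponding pairs.
auc = empirical_auc([0.9, 0.8, 0.4], [0.5, 0.3])
```

An AUC of 0.5 corresponds to chance-level separation of the two groups and 1.0 to perfect separation.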

RESULTS: 

In our preliminary study, we tested the proposed scheme using an FFDM database, which includes 131
biopsy-proven lesions (63 benign and 68 malignant). From this database, we constructed 124 corresponding
pairs and 35 non-corresponding pairs. Each pair consists of a craniocaudal (CC) view and a mediolateral (ML)
view. To reflect the most realistic scenario of lesion mismatch in clinical practice, the non-corresponding
pairs were constructed from cases of the same patients but different physical lesions.

The correlation between the distance feature obtained with the automatic nipple identification method and
that obtained with manual nipple identification was 0.997 (p<0.0005). In leave-one-out evaluation by lesion,
the distance feature outperformed all other single features, yielding an AUC of 0.80. The distance feature,
lesion size and lesion contrast were selected as the effective feature subset and yielded an AUC of 0.86. The
improvement from using multiple features was statistically significant compared with single-feature
performance (p = 0.0075).

NEW  OR  BREAKTHROUGH  WORK  TO  BE  PRESENTED: 

Our study includes three attractive features: 1) The correlative feature analysis (CFA) framework is based on
computer-extracted features instead of the original images. This differs from conventional image
registration, whose task is to align two images known to represent the same object, whereas the task
of CFA is to evaluate the probability that two given images represent the same object; 2) The newly
developed distance feature improves single-feature performance from an AUC of 0.71 (lesion size) to 0.80
with leave-one-out evaluation (p = 0.04); 3) The first step of the new two-step feature selection method
effectively reduces the dimensionality of the feature space, and thus improves the performance to an AUC of
0.86 with leave-one-out evaluation, as compared with 0.76 when applying the original features directly
(p = 0.03).

CONCLUSIONS: 

We have presented a correlative feature analysis framework to estimate the probability that a given pair of
images is of the same physical lesion, and our investigation indicates that the proposed method is a
promising way to distinguish between corresponding and non-corresponding pairs. We are collecting more
cases to evaluate our method on a larger scale.
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